IDEnet: Inception-Based Deep Convolutional Neural Network for Crowd Counting Estimation
In the crowd counting task, the goal is to estimate a density map and a count of people from a given crowd image. Our analysis identifies two major problems that must be solved in crowd counting: the scale-invariance problem and the inhomogeneous-density problem. Many methods have been developed to tackle these problems, for example by designing density-aware or scale-adaptive models. Motivated by both problems, we propose a density-aware, inception-based neural network that addresses them jointly: the Inception Dense Estimator network (IDEnet). IDEnet consists of two modules, the Inception Dense Block (IDB) and the Dense Evaluator Unit (DEU). Several variations of IDEnet are evaluated and analysed to find the best model, which we then evaluate on the UCF50 and ShanghaiTech datasets. IDEnet outperforms the current state-of-the-art method on the ShanghaiTech Part B dataset. We conclude our work with six key conclusions based on our experiments and error analysis.
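As a rough illustration of the density-map formulation, here is a minimal PyTorch sketch of an inception-style counting network. The branch widths, kernel sizes, and depth are illustrative assumptions only; the abstract names the IDB and DEU modules but does not describe their internal configuration.

```python
# Minimal sketch of an inception-style density-map estimator (PyTorch).
# Layer sizes are assumptions, not the paper's actual IDB/DEU setup.
import torch
import torch.nn as nn

class InceptionDenseBlock(nn.Module):
    """Parallel convolutions with different receptive fields, concatenated
    so the block can respond to people at multiple scales."""
    def __init__(self, in_ch, branch_ch=16):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, kernel_size=5, padding=2)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1))

class TinyIDEnet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            InceptionDenseBlock(3),      # 3 -> 48 channels
            InceptionDenseBlock(48),     # 48 -> 48 channels
        )
        self.head = nn.Conv2d(48, 1, kernel_size=1)  # 1-channel density map

    def forward(self, x):
        density = torch.relu(self.head(self.features(x)))
        count = density.sum(dim=(1, 2, 3))  # count = integral of density map
        return density, count

model = TinyIDEnet()
density, count = model(torch.randn(1, 3, 224, 224))
```

The key point the sketch captures is that the count is not predicted directly: it is the spatial sum of the estimated density map, while the multi-branch blocks give the network multiple receptive-field sizes for the scale-invariance problem.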
CountNet: End to End Deep Learning for Crowd Counting
We approach the crowd counting problem as an end-to-end deep learning process that requires both correct recognition and correct counting. This paper redefines crowd counting as a counting process, rather than merely the recognition process it was previously framed as. CountNet uses an Xception network layered with fully connected layers; the pretrained Xception parameters serve as transfer learning and are trained again together with the fully connected layers. CountNet then achieves better crowd counting performance when trained on an augmented dataset, making it robust to scale and slice variations.
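A minimal sketch of this transfer-learning setup, assuming a pretrained Xception backbone from timm and an illustrative fully connected head (the paper's exact head sizes are not given in the abstract):

```python
# Sketch of the CountNet idea: a pretrained Xception backbone (transfer
# learning) topped with fully connected layers that regress a count.
# Head sizes are assumptions; the timm model name may vary by version.
import timm
import torch
import torch.nn as nn

backbone = timm.create_model("xception", pretrained=True, num_classes=0)  # pooled features

class CountNet(nn.Module):
    def __init__(self, backbone, feat_dim=2048):
        super().__init__()
        self.backbone = backbone
        self.fc = nn.Sequential(       # fully connected counting head
            nn.Linear(feat_dim, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 1),         # scalar people count
        )

    def forward(self, x):
        return self.fc(self.backbone(x)).squeeze(-1)

model = CountNet(backbone)
count = model(torch.randn(2, 3, 299, 299))  # Xception's native input size
```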
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems
Large language models (LLMs) have been used for diverse tasks in natural
language processing (NLP), yet remain under-explored for task-oriented dialogue
systems (TODS), especially for end-to-end TODS. We present InstructTODS, a
novel off-the-shelf framework for zero-shot end-to-end task-oriented dialogue
systems that can adapt to diverse domains without fine-tuning. By leveraging
LLMs, InstructTODS generates a proxy belief state that seamlessly translates
user intentions into dynamic queries for efficient interaction with any KB. Our
extensive experiments demonstrate that InstructTODS achieves comparable
performance to fully fine-tuned TODS in guiding dialogues to successful
completion without prior knowledge or task-specific data. Furthermore, a
rigorous human evaluation of end-to-end TODS shows that InstructTODS produces
dialogue responses that notably outperform both the gold responses and the
state-of-the-art TODS in terms of helpfulness, informativeness, and humanness.
Moreover, the effectiveness of LLMs in TODS is further supported by our
comprehensive evaluations on TODS subtasks: dialogue state tracking, intent
classification, and response generation. Code and implementations can be
found at https://github.com/WillyHC22/InstructTODS.
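To make the proxy-belief-state idea concrete, here is a minimal sketch: the LLM turns the dialogue history into a structured belief state, which then filters an arbitrary KB. The prompt wording, KB schema, and call_llm stub are hypothetical stand-ins, not the paper's actual implementation.

```python
# Sketch of the proxy-belief-state pipeline: LLM -> structured query -> KB.
# `call_llm` is a hypothetical stand-in for any LLM API; the prompt and
# KB schema are illustrative assumptions.
import json

def call_llm(prompt: str) -> str:
    # Stub: replace with a real LLM client call.
    return '{"area": "north", "food": "indonesian"}'

PROMPT = (
    "Given the dialogue below, output a JSON object whose keys are slot "
    "names and whose values are the user's constraints.\n\nDialogue:\n{dialogue}"
)

def proxy_belief_state(dialogue: str) -> dict:
    return json.loads(call_llm(PROMPT.format(dialogue=dialogue)))

def query_kb(kb: list[dict], belief: dict) -> list[dict]:
    # Keep KB entries consistent with every constraint in the belief state.
    return [row for row in kb
            if all(str(row.get(slot, "")).lower() == str(val).lower()
                   for slot, val in belief.items())]

kb = [{"name": "Warung Sari", "area": "north", "food": "indonesian"},
      {"name": "Pasta Place", "area": "south", "food": "italian"}]
belief = proxy_belief_state("User: I want Indonesian food in the north.")
print(query_kb(kb, belief))  # -> [{'name': 'Warung Sari', ...}]
```

Because the belief state is generated on the fly rather than tracked against a fixed ontology, the same loop can in principle interact with any KB, which is what makes the framework zero-shot.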
Contrastive Learning for Inference in Dialogue
Inference, especially inference derived from inductive processes, is a
crucial component of conversation, complementing the information implicitly or
explicitly conveyed by a speaker. While recent large language models show
remarkable advances in inference tasks, their performance in inductive
reasoning, where not all information is present in the context, is far behind
deductive reasoning. In this paper, we analyze the behavior of the models based
on the task difficulty defined by the semantic information gap -- which
distinguishes inductive and deductive reasoning (Johnson-Laird, 1988, 1993).
Our analysis reveals that the disparity in information between dialogue
contexts and desired inferences poses a significant challenge to the inductive
inference process. To mitigate this information gap, we investigate a
contrastive learning approach by feeding negative samples. Our experiments
suggest that negative samples help models understand what is wrong and improve
their inference generation.
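The abstract does not spell out the training objective, but a common way to feed negative samples is an InfoNCE-style contrastive loss in which the gold inference must score above sampled wrong inferences; here is a minimal sketch under that assumption.

```python
# InfoNCE-style sketch of contrastive learning with negative samples: the
# model is pushed to score the gold inference for a dialogue context above
# sampled wrong inferences. Encoder choice and temperature are assumptions;
# the paper's exact objective may differ.
import torch
import torch.nn.functional as F

def contrastive_loss(ctx_emb, pos_emb, neg_embs, temperature=0.07):
    """ctx_emb: (B, D) dialogue contexts; pos_emb: (B, D) gold inferences;
    neg_embs: (B, K, D) negative inferences per context."""
    ctx = F.normalize(ctx_emb, dim=-1)
    pos = F.normalize(pos_emb, dim=-1)
    neg = F.normalize(neg_embs, dim=-1)
    pos_score = (ctx * pos).sum(-1, keepdim=True)        # (B, 1)
    neg_score = torch.einsum("bd,bkd->bk", ctx, neg)     # (B, K)
    logits = torch.cat([pos_score, neg_score], dim=1) / temperature
    labels = torch.zeros(len(logits), dtype=torch.long)  # gold is index 0
    return F.cross_entropy(logits, labels)

loss = contrastive_loss(torch.randn(4, 256), torch.randn(4, 256),
                        torch.randn(4, 8, 256))
```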
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages
Democratizing access to natural language processing (NLP) technology is
crucial, especially for underrepresented and extremely low-resource languages.
Previous research has focused on developing labeled and unlabeled corpora for
these languages through online scraping and document translation. While these
methods have proven effective and cost-efficient, we have identified
limitations in the resulting corpora, including a lack of lexical diversity and
cultural relevance to local communities. To address this gap, we conduct a case
study on Indonesian local languages. We compare the effectiveness of online
scraping, human translation, and paragraph writing by native speakers in
constructing datasets. Our findings demonstrate that datasets generated through
paragraph writing by native speakers exhibit superior quality in terms of
lexical diversity and cultural content. In addition, we present the
NusaWrites benchmark, encompassing 12 underrepresented and extremely
low-resource languages spoken by millions of individuals in Indonesia. Our
empirical experiment results using existing multilingual large language models
conclude the need to extend these models to more underrepresented languages. We
release the NusaWrites dataset at https://github.com/IndoNLP/nusa-writes.
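As a toy illustration of one lexical-diversity measure (the abstract does not specify which metrics the study uses), the type-token ratio counts distinct words over total words and gives a quick way to compare a scraped corpus against natively written text:

```python
# Type-token ratio (TTR): distinct words / total words. A minimal,
# assumption-laden proxy for the lexical-diversity comparison described
# in the abstract, not the paper's actual evaluation.
def type_token_ratio(corpus: list[str]) -> float:
    tokens = [tok for text in corpus for tok in text.lower().split()]
    return len(set(tokens)) / len(tokens) if tokens else 0.0

scraped = ["berita hari ini berita hari ini"]                # repetitive
written = ["cerita rakyat tentang asal mula danau toba"]     # varied
print(type_token_ratio(scraped), type_token_ratio(written))  # 0.5 vs 1.0
```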
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
We present NusaCrowd, a collaborative initiative to collect and unify
existing resources for Indonesian languages, including opening access to
previously non-public resources. Through this initiative, we have brought
together 137 datasets and 118 standardized data loaders. The quality of the
datasets has been assessed manually and automatically, and their value is
demonstrated through multiple experiments. NusaCrowd's data collection enables
the creation of the first zero-shot benchmarks for natural language
understanding and generation in Indonesian and the local languages of
Indonesia. Furthermore, NusaCrowd brings the creation of the first multilingual
automatic speech recognition benchmark in Indonesian and the local languages of
Indonesia. Our work strives to advance natural language processing (NLP)
research for languages that are under-represented despite being widely spoken.
GEMv2: Multilingual NLG benchmarking in a single line of code
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables comparison on an equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model-evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure that lets dataset, model, and metric developers benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online, and our interactive data-card creation and rendering tools make it easier to add new datasets to the living benchmark.
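The abstract does not show GEMv2's actual API, so the following is only a generic sketch of the modular idea it describes: datasets and metrics register themselves, and a single evaluate() call wires them together.

```python
# Generic registry sketch of the "single line of code" idea. This mirrors
# the modular design described in the abstract; it is not GEMv2's real API.
DATASETS, METRICS = {}, {}

def register(registry, name):
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap

@register(DATASETS, "toy_summarization")   # hypothetical dataset
def load_toy():
    return [("long input text", "reference summary")]

@register(METRICS, "exact_match")          # hypothetical metric
def exact_match(pred, ref):
    return float(pred.strip() == ref.strip())

def evaluate(model, dataset, metrics):
    data = DATASETS[dataset]()
    return {m: sum(METRICS[m](model(x), y) for x, y in data) / len(data)
            for m in metrics}

# The benchmark call itself stays a single line:
scores = evaluate(lambda x: "reference summary", "toy_summarization", ["exact_match"])
```

The design point is that contributors only touch their own registry entry: a new dataset or metric becomes available to every existing model without changing the evaluation call.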